Reduction in the quality and quantity of agricultural products, particularly
peanut or groundnut, is usually associated with disease. This could be addressed
through automatic identification and diagnosis using deep learning.
However, this technology has not yet been explored for peanut leaf spot
disease, partly because of the limited availability of data for training and
testing a model. This study explores the use of pre-trained Visual Geometry
Group 16 (VGG16), Visual Geometry Group 19 (VGG19), InceptionV3, MobileNet,
DenseNet, Xception, InceptionResNetV2, and ResNet50 architectures and 
deep learning optimizers such as stochastic gradient descent (SGD) with 
Momentum, adaptive moment estimation (Adam), root mean square 
propagation (RMSProp), and adaptive gradient algorithm (Adagrad) in 
creating a model that can identify leaf spot disease by using a total of 1,000 
images of leaves captured using a mobile camera. A confusion matrix was
used to assess the accuracy and precision of the results. The results
show that DenseNet-169 trained using SGD with momentum, Adam,
and RMSProp attained the highest accuracy of 98%, while DenseNet-169
trained using RMSProp achieved the highest precision of 98% among the
pre-trained deep convolutional neural network architectures. Furthermore,
this result could be beneficial in agricultural automation and disease 
identification systems for peanut or groundnut plants. 

1. INTRODUCTION


Peanut (Arachis hypogaea L.) is one of the commercially important legumes cultivated by local
farmers in the Philippines [1]. However, low production is associated with low-yielding varieties, lack of
high-yielding adaptive cultivars, pests and diseases, poor agronomic practices, climate change, and limited
use of inputs [2]. The most common peanut diseases affecting production are Sclerotinia blight, late
leaf spot, northern root-knot nematode, spotted wilt, stem rot, early leaf spot, rust, web blotch, Diplodia collar
rot, funky or irregular leaf spot, Rhizoctonia limb rot, Cylindrocladium black rot, Aspergillus crown rot, and
peanut root-knot nematode [3]. Among these, early leaf spot and late leaf spot are the most typical
peanut diseases in the country because of its warm and humid climate [4]. Leaf spot disease results in increased
defoliation and yield losses of up to 50% [5]. The reduction in quality and quantity of agricultural products such as
peanut can potentially be addressed through early detection and regular monitoring of diseases. Yet this is difficult
to achieve, since peanut diseases such as leaf spot are usually detected by manually
examining and monitoring changes in morphology or physical traits. Efficient monitoring is
a major problem in most areas of the country because of the limited manpower available for the task [6].

Meanwhile, early detection and surveillance of plant diseases are commonly addressed using deep
convolutional neural networks [7]. The primary purpose of
a deep convolutional neural network is to learn the characteristics of the data
through convolution operations [8]. The complexity of the architecture and its capability to yield
highly accurate models have made the deep convolutional neural network a prominent and active topic in image
recognition [9], particularly in disease identification [10].

Deep convolutional neural networks (deep CNNs) differ from conventional neural networks in the
sense that they have more nodes and more complex means of layer interconnection [11]. Due to this structure,
training a deep convolutional neural network requires high computing power [12] and a large amount of data
in order to attain good results [13]. This technology, however, has not yet been explored or examined in
peanut farming [14], particularly for the identification of peanut leaf spot disease, owing to constraints such
as the limited availability of data needed to train and test the model [11].

The purpose of the study is to explore the use of transfer learning [15], isolation or
background elimination [7], and deep learning optimizers to address the insufficient data problem in
creating a model that can identify leaf spot disease [16]. Specifically, it aims to: gather images of peanut or
groundnut leaves, pre-process the images, perform transfer learning on the images
using different architectures, optimizers, and learning rates to build classifiers, and evaluate the models.
The contributions of this study are the locally collected dataset and the method used to obtain the
best model for peanut leaf spot disease identification. This study also provides a comprehensive
evaluation of the different pre-trained deep CNN architectures, including the application of different deep
learning optimizers. The article is organized as follows: section 2 reviews the theoretical background and
related research; section 3 explains the methodology, including the experimental set-up, data gathering, data
pre-processing, training, and evaluation; section 4 presents the results obtained during training and evaluation;
and section 5 provides the conclusions and recommendations.


2. THEORETICAL BACKGROUND AND RELATED RESEARCH
2.1. Peanut industry of the Philippines 

In the Philippines, the National Capital Region is home to the majority of the country's peanut
industries, while provincial areas in Luzon and Mindanao are home to the remaining micro-scale producers
[1]. Despite rising local and global demand for finished peanut products, local farmers are still unable to
meet the industry's demands, forcing local peanut product makers to import raw peanuts from other
countries. The inability of local farmers to meet the demand is attributed to low-yielding varieties, lack of
high-yielding adaptive cultivars, pests and diseases, poor agronomic practices, climate change, and limited
use of inputs [2]. Among diseases, early and late leaf spot cause defoliation leading to
yield losses of about 50% or more [3]. These diseases are common in the country because of the humid weather in most
parts of the country during specific months of the year [4].


2.2. Computer vision and agriculture 

As computational systems developed, the application of machine learning to computer vision grew
rapidly, leading to the development of novel methodologies and models [17], which now form a
new category, that of deep learning [18]. These methodologies and models are now used for the detection and
accurate identification of diseases in grain crops, which is of great importance for their effective management
and for productive and sustainable agriculture [6]. The diagnosis of plant diseases is usually
performed visually and may be flawed because of its laborious and subjective nature [7]. The study of
Barbedo [19] suggests methods that combine computer vision with artificial intelligence to automate the
detection of diseases in plants. The automatic detection of diseases from images includes, among other
factors, the determination of the most discriminative characteristics for the efficient recognition of the
disease. The use of computer vision, particularly deep learning, in agriculture began only
in recent years, and to a rather limited extent. It has been used to identify diseases in
rice [20], banana [21], avocado [22], grapes [23], coffee [24], abaca [7], and cassava [12].


2.3. Deep convolutional neural network and transfer learning 

The accuracy of the network used to detect diseases relies mainly on the complexity and structure of
the architecture [11]. Deep convolutional neural networks come in different architectures, each
implementing the idea of a deep convolutional neural network differently [18]. One example is the
Visual Geometry Group (VGG) network architecture, which is distinguished by its
simplicity. It uses 3x3 convolutional layers stacked on top of one another in increasing depth, with max
pooling to reduce volume size, and comes in depths of 16 and 19 weight layers [25]. Meanwhile, the deep residual
learning framework (ResNet) was developed to address the degradation problem that arises in deeper networks. Instead
of hoping that each few stacked layers directly fit a desired underlying mapping, these layers are made to fit a
residual mapping [26]. The DenseNet architecture offers another solution to the degradation problem.
This architecture simplifies the connectivity pattern and ensures maximum
information flow between layers [27]. MobileNet, which is also a deep convolutional network architecture, is better
suited to mobile and embedded vision applications with limited computational resources. Compared with
a network of the same depth built from conventional convolutions, this design employs depthwise
separable convolutions, which greatly decrease the number of parameters, resulting in a lightweight deep
neural network [28]. Furthermore, Inception-v3 is a convolutional neural network architecture from
the Inception family that uses label smoothing, factorized convolutions, and an auxiliary classifier to
propagate label information lower down the network [29]. The Xception architecture, on the other hand,
is a version of the Inception architecture that replaces Inception
modules with depthwise separable convolutions [30].
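
The parameter saving attributed to depthwise separable convolutions above can be illustrated with a short calculation. The following is a minimal sketch, not taken from the study: the function names and the layer sizes (a 3x3 kernel with 128 input and 256 output channels) are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer sizes (assumed, not reported in the study).
    k, c_in, c_out = 3, 128, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, depthwise separable: {sep}, ratio: {std / sep:.2f}")
```

For this assumed layer, the separable form needs 33,920 weights instead of 294,912, which is the kind of reduction that makes MobileNet and Xception comparatively lightweight.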

The structure of the architecture is a main concern because training a deep convolutional
neural network is difficult: it requires high computing power [12] and a large amount of data
to attain good results [13]. The studies of Jiang et al. [13], Ramcharan et al. [12], and Sagar and
Dheeba [15] explore the use of transfer learning and deep learning optimizers to address the
insufficient data problem and to minimize training time.


3. RESEARCH METHOD 
3.1. Experimental setup 

The experiment began with capturing images of healthy and leaf spot infected peanut leaves at the sampling
site using a mobile camera. The background of the image was also considered as part of the
experiment. Images were cleaned, pre-processed, and augmented to eliminate duplicates, improve
image quality, and introduce minimal distortion, which helps reduce overfitting at the
training stage. The cleaned images formed the dataset used for training.

During training, the dataset was divided into two groups: healthy and infected peanut leaves.
Weights pre-trained on the large ImageNet dataset for the VGG16, VGG19, InceptionV3, MobileNet,
DenseNet-169, Xception, InceptionResNetV2, and ResNet50 architectures were used for retraining. During retraining,
deep learning optimizers such as stochastic gradient descent (SGD) with momentum, adaptive moment
estimation (Adam), root mean square propagation (RMSProp), and adaptive gradient algorithm (Adagrad)
were explored at different learning rates. To give an unbiased evaluation of a model retrained on the
training dataset, the candidate model was used to predict the responses in the validation dataset. The
candidate retrained models were then evaluated on the test dataset to assess their performance. Each candidate
model was assessed based on its ability to distinguish between healthy and infected peanut leaves in photos.
Figure 1 shows the experimental set-up used in the study.

Figure 1. Experiment set-up


3.2. Data gathering

The sampling site of the study is in Liloy, Zamboanga del Norte. The camera used was a Realme 6i
mobile phone with a 48-megapixel back camera and a 16-megapixel front camera. When taking images, the
camera was set to manual mode and positioned 0.25 meter above the leaves. In this study, isolation or
background elimination was also considered [13]. Sample images of healthy and infected leaves, with and
without background, are shown in Figures 2(a) to 2(d). A total of 1,000 images captured from the sampling site
were validated by experts in plant disease.




Figure 2. Sample images of (a) healthy leaves without background, (b) with background elimination, (c) leaf 
spot infected leaves without background, and (d) with background elimination 


3.3. Data pre-processing 

Before training, images were labeled based on two groups: healthy and infected leaves. Following
the study of Mohanty et al. [10], the dataset was divided as follows: 600 images (60%) were used for training,
100 images (10%) for validation, and 100 images (10%) for testing. Table 1 shows the
distribution of actual samples based on different factors. Before the images were fed into the network, they
were resized to 255x255 in order to reduce the training time [7]. Augmentation such as affine transformation
was also applied to enlarge the dataset and apply slight distortion to the images to reduce
overfitting [6]. In this study, rescaling and pixel normalization were also applied.
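
The pixel normalization and affine augmentation mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the study does not report the exact transformation parameters, so the rotation angle and the helper names (`normalize_pixels`, `rotation_about_center`, `apply_affine`) are hypothetical.

```python
import math


def normalize_pixels(pixels):
    """Rescale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [[value / 255.0 for value in row] for row in pixels]


def rotation_about_center(angle_deg, width, height):
    """Return a 2 x 3 affine matrix that rotates points about the image centre."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = width / 2.0, height / 2.0
    # Equivalent to: move the centre to the origin, rotate, then move it back.
    tx = cx - cos_t * cx + sin_t * cy
    ty = cy - sin_t * cx - cos_t * cy
    return [[cos_t, -sin_t, tx], [sin_t, cos_t, ty]]


def apply_affine(matrix, x, y):
    """Map the point (x, y) through a 2 x 3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    return a * x + b * y + tx, c * x + d * y + ty


if __name__ == "__main__":
    # A slight 10-degree rotation of a 255 x 255 image (angle is an assumed example value).
    m = rotation_about_center(10, 255, 255)
    print(apply_affine(m, 0, 0))           # where the top-left corner lands
    print(normalize_pixels([[0, 128, 255]]))
```

In a Keras pipeline, the same effect is usually obtained from the library's built-in augmentation utilities rather than hand-written matrices; the sketch only shows what a slight affine distortion does to image coordinates.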


Table 1. Distribution of actual samples based on different factors 


Part             Type                Images per type  Total number of images  Training 60%  Validation 20%  Test 20%
Healthy leaves   With background     250 images       500 images              300 images    50 images       50 images
                 Without background  250 images
Infected leaves  With background     250 images       500 images              300 images    50 images       50 images
                 Without background  250 images
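
The per-class split can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: it assumes the fractions stated in the text of section 3.3 (60% training, 10% validation, 10% testing), a fixed random seed, and hypothetical file names.

```python
import random


def split_class(paths, train_frac=0.6, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle the images of one class and cut them into train/validation/test subsets."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    n_test = round(n * test_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical file names standing in for the 500 healthy-leaf images.
    healthy = [f"healthy_{i:03d}.jpg" for i in range(500)]
    train, val, test = split_class(healthy)
    print(len(train), len(val), len(test))
```

Splitting each class separately keeps the healthy and infected groups balanced across the three subsets.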


3.4. Training of data using transfer learning approach 

Transfer learning is based on the idea of reusing a model previously trained on a similar domain
rather than training a new model from scratch, in order to improve performance. In this study, pre-trained
VGG16, VGG19, InceptionV3, MobileNet, DenseNet-169, Xception, InceptionResNetV2, and ResNet50
were used for training. The approach is depicted in Figure 3 and begins with several data augmentation
steps. The images used for training and testing are converted to a 1D array by the flatten layer and fed
to the dense layer. A dropout layer with a rate of 0.7 and a sigmoid activation function were
added for classification. Following the study of Saleem et al. [16], this study used different deep
learning optimizers: stochastic gradient descent (SGD) with momentum (beta = 0.9) [31], adaptive
moment estimation (Adam), Adagrad, and root mean square propagation (RMSProp), at different learning
rates [16].
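
A transfer learning set-up of this kind might look as follows in Keras. This is a hedged sketch, not the authors' code: the layer order (flatten, dropout, single sigmoid unit), the frozen base, the default learning rate of 1e-3, and the helper names `optimizer_config` and `build_model` are all assumptions made for illustration. TensorFlow is imported inside `build_model` so that the small configuration helper can be used without it.

```python
# Architecture names as exposed by tf.keras.applications.
ARCHITECTURES = [
    "VGG16", "VGG19", "InceptionV3", "MobileNet",
    "DenseNet169", "Xception", "InceptionResNetV2", "ResNet50",
]


def optimizer_config(name, learning_rate):
    """Map an optimizer used in the study to a Keras class name and its arguments."""
    configs = {
        "sgd_momentum": ("SGD", {"learning_rate": learning_rate, "momentum": 0.9}),
        "adam": ("Adam", {"learning_rate": learning_rate}),
        "rmsprop": ("RMSprop", {"learning_rate": learning_rate}),
        "adagrad": ("Adagrad", {"learning_rate": learning_rate}),
    }
    return configs[name]


def build_model(architecture="DenseNet169", optimizer="sgd_momentum",
                learning_rate=1e-3, image_size=255):
    """Build a binary healthy/infected classifier on top of an ImageNet-trained base."""
    import tensorflow as tf  # lazy import: only needed when a model is actually built

    base_cls = getattr(tf.keras.applications, architecture)
    base = base_cls(weights="imagenet", include_top=False,
                    input_shape=(image_size, image_size, 3))
    base.trainable = False  # assumption: only the newly added head is trained

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),                       # feature maps -> 1D array
        tf.keras.layers.Dropout(0.7),                    # dropout rate reported in the text
        tf.keras.layers.Dense(1, activation="sigmoid"),  # healthy vs. infected
    ])
    class_name, kwargs = optimizer_config(optimizer, learning_rate)
    model.compile(
        optimizer=getattr(tf.keras.optimizers, class_name)(**kwargs),
        loss="binary_crossentropy",
        metrics=["accuracy", tf.keras.metrics.Precision()],
    )
    return model
```

Calling `build_model("DenseNet169", "rmsprop")` would download the ImageNet weights on first use. Each architecture in `tf.keras.applications` also ships its own `preprocess_input` function, which this sketch omits for brevity.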

Figure 3. The transfer deep learning approach used in the study


3.5. Hardware and software

To train and test the model, the Python programming language was used, with code written in Jupyter
Notebook. Python libraries such as pandas, NumPy, matplotlib.pyplot, math, sklearn, and Keras built on top
of the TensorFlow framework [7] were utilized. Table 2 lists the specification of the hardware used in the
study.


Table 2. Hardware specification used in the study 


Parts           Specification
Processor       Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
Memory (RAM)    8 gigabytes (GB)
Storage         1 terabyte (TB)
Graphics        NVIDIA GeForce GTX 1080 Ti graphics card


3.6. Evaluation 

The typical method for evaluating the performance of artificial neural networks is to divide data into
three sets: training, validation, and test, and then train a neural network on the training set and use the test set
for prediction. Because the test set's actual outcomes and the model's predicted outcomes are both known,
the accuracy of the predictions can be measured. During the testing process, a confusion matrix was used,
and the different models were evaluated on the basis of accuracy and precision, whose formulas are shown
in (1) and (2), respectively, where TP, TN, FP, and FN denote the numbers of true positives, true negatives,
false positives, and false negatives. Accuracy is defined as how often the classifier is correct, while precision is the
proportion of positive predictions that are true positives. It answers the question:
when the classifier predicts yes, how often is it correct?


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$


The average (mean) over the complete training run was considered in this study [7].
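
Equations (1) and (2) can be computed directly from the confusion-matrix counts. The sketch below is illustrative only: the labels are toy values, not results from the study, and the infected class is assumed to be the positive class (label 1).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, true negatives, false positives, and false negatives."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Equation (1): share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Equation (2): share of positive predictions that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only (1 = infected, 0 = healthy).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print(accuracy(tp, tn, fp, fn), precision(tp, fp))
```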


4. RESULTS AND DISCUSSION 

The goal of this study is to create models that can distinguish diseased from healthy
leaves, and to explore the use of transfer learning, isolation or background elimination, and deep learning
optimizers given the limited data. The succeeding subsections elaborate on the results of the training
and evaluation process.


4.1. Training 

In this research, weights trained using 10,000+ images from ImageNet were obtained and utilized.
Preliminary model training, with visualization of the results, was performed to assess the performance of the models
trained using different deep learning optimizers. The assessment was done because this research used a
small dataset that differs substantially from the dataset used by the pre-trained models.

In this study, each model was trained over 100 epochs. This allowed the researcher
to monitor the training process closely and to identify at which stage of
training the accuracy and loss start to saddle, and to what extent. Here, a saddle point refers to the stage where the gradient
reaches a plateau and the training loss becomes harder to improve.
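
One simple way to locate the stage at which the loss stops improving is to scan the per-epoch loss history. The following is a minimal sketch under stated assumptions: the `patience` and `min_delta` thresholds and the function name `plateau_epoch` are hypothetical choices, and the loss values are toy numbers, not measurements from the study.

```python
def plateau_epoch(losses, patience=5, min_delta=1e-3):
    """Return the 1-based epoch of the last meaningful improvement once the loss has
    failed to improve by more than min_delta for `patience` epochs, or None if it never stalls."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss          # meaningful improvement: remember it
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch    # no improvement for `patience` epochs in a row
    return None


if __name__ == "__main__":
    # Toy loss curve for illustration only.
    history = [1.0, 0.6, 0.4, 0.3, 0.3, 0.3, 0.3]
    print(plateau_epoch(history, patience=3))
```

The same idea underlies early-stopping callbacks in deep learning libraries, although the study itself trained every model for the full 100 epochs.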


4.2. Evaluation 

During testing, the accuracy and precision of the various architectures trained using different deep 
learning optimizers were evaluated. The results were tallied and the average (mean) over the 100 epochs of
training was taken into account. The accuracy and precision are presented in sections 4.2.1 and 4.2.2.


4.2.1. Accuracy 

The performance of the different pre-trained deep CNN architectures when trained using different 
deep learning optimizers in terms of accuracy is illustrated in Figure 4. The results show that 6 out of 8
architectures used in the study achieved an accuracy of 90% and above. SGD with
momentum, Adam, and RMSProp occupied the top three ranks in terms of
accuracy for most of the pre-trained deep CNN architectures. Overall, DenseNet-169 trained using SGD
with momentum, Adam, and RMSProp attained the highest accuracy (98%) among the pre-trained deep CNN
architectures used in the study.

Figure 4. The accuracy of several deep CNN architectures when trained with different deep learning
optimizers


4.2.2. Precision 

The performance of the different pre-trained deep CNN architectures when trained using different 
deep learning optimizers in terms of precision is illustrated in Figure 5. The results show that 6 of the 8
architectures used in the study had a precision of 90% or higher. Meanwhile, SGD with momentum, Adam, and
RMSProp occupied the top three ranks in terms of precision for most of the
pre-trained deep CNN architectures. Overall, DenseNet-169 trained using RMSProp attained the highest
precision (98%) among the pre-trained deep CNN architectures used in the study.

Figure 5. The precision of various architectures trained using various deep learning optimizers


5. CONCLUSION AND RECOMMENDATION 

One of the factors associated with the low production of peanut or groundnut is disease,
particularly early leaf spot and late leaf spot, which are common in the Philippines due to its warm and
humid climate. Studies suggest that early detection and appropriate monitoring can help
prevent and control the disease. In computer vision, the deep convolutional neural
network is becoming the preferred method for disease identification and classification due to its strong
performance. This study explored the use of transfer learning, isolation or background elimination,
and deep learning optimizers to address the insufficient data problem in creating a model that can
identify leaf spot disease.

Using pre-trained VGG16, VGG19, InceptionV3, MobileNet, DenseNet-169, Xception,
InceptionResNetV2, and ResNet50 architectures retrained using deep learning optimizers, the results show
that DenseNet-169 trained using SGD with momentum, Adam, and RMSProp attained the highest accuracy,
while DenseNet-169 trained using RMSProp achieved the highest precision among the pre-trained deep
CNN architectures used in the study.

Given the findings, the use of pre-trained deep convolutional neural networks (transfer learning),
pre-processing techniques, and deep learning optimizers can alleviate the problem of data
insufficiency and attain better accuracy and precision. In addition, this study could be beneficial in
agricultural automation, particularly in creating robotic systems and disease identification systems that can
identify or classify healthy and infected peanut or groundnut plants. Further, it is recommended to
increase not only the number of training images but also the number of classes, covering other commercially
important Philippine crops in particular. A follow-up study is also recommended to further evaluate the
models once implemented, in terms of effectiveness and cost in comparison with other classification
algorithms.